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ABSTRACT 

In this paper, we propose two algorithms for solving lin- 
ear inverse problems when the observations are corrupted 
by noise. A proper data fidelity term (log-likelihood) is in- 
troduced to reflect the statistics of the noise (e.g. Gaus- 
sian, Poisson). On the other hand, as a prior, the images 
to restore are assumed to be positive and sparsely repre- 
sented in a dictionary of waveforms. Piecing together the 
data fidelity and the prior terms, the solution to the in- 
verse problem is cast as the minimization of a non-smooth 
convex functional. We establish the well-posedness of the 
optimization problem, characterize the corresponding min- 
imizers, and solve it by means of primal and primal-dual 
proximal splitting algorithms originating from the field of 
non-smooth convex optimization theory. Experimental re- 
sults on deconvolution, inpainting and denoising with some 
comparison to prior methods are also reported. 
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1. INTRODUCTION 

A lot of works have already been dedicated to linear in- 
verse problems with Gaussian noise (see [16 for a compre- 
hensive review), while linear inverse problems in presence 
of other kind of noise such as Poisson noise have attracted 
less interest, presumably because noises properties are more 
complicated to handle. Such inverse problems have how- 
ever important applications in imaging such as restoration 
(e.g. deconvolution in medical and astronomical imaging), 
or reconstruction (e.g. computerized tomography). 

Since the pioneer work for Gaussian noise by [9] , many other 
methods have appeared for managing linear inverse problem 
with sparsity regularization. But they limited to the Gaus- 
sian case. In the context of Poisson linear inverse prob- 
lems using sparsity-promoting regularization, a few recent 



algorithms have been proposed. For example, [10 stabi- 
lize the noise and proposed a family of nested schemes rely- 
ing upon proximal splitting algorithms (Forward- Backward 
and Douglas- Rachford) to solve the corresponding optimiza- 
tion problem. The work of 14] is in the same vein. These 
methods may be extended to other kind of noise. How- 
ever, nested algorithms are time-consuming since they ne- 
cessitate to sub-iterate. Using the augmented Lagrangian 
method with the alternating method of multipliers algorithm 
(ADMM), which is nothing but the Douglas- Rachford split- 
ting applied to the Fenchel-Rockafellar dual problem, [13] 
presented a deconvolution algorithm with TV and sparsity 
regularization, and [1 a denoising algorithm for multiplica- 
tive noise. This scheme however necessitates to solve a least- 
square problem which can be done explicitly only in some 
cases. 

In this paper, we propose a framework for solving linear 
inverse problems when the observations are corrupted by 
noise. In order to form the data fidelity term, we take the 
exact likelihood. As a prior, the images to restore are as- 
sumed to be positive and sparsely represented in a dictio- 
nary of atoms. The solution to the inverse problem is cast 
as the minimization of a non-smooth convex functional, for 
which we prove well-posedness of the optimization problem, 
characterize the corresponding minimizers, and solve them 
by means of primal and primal-dual proximal splitting al- 
gorithms originating from the realm of non-smooth convex 
optimization theory. Convergence of the algorithms is also 
shown. Experimental results and comparison to other algo- 
rithms on deconvolution are finally conducted. 

Notation and terminology 

Let % a real Hilbert space, here a finite dimensional vector 
subspace of R^. We denote by ||.|| the norm associated with 
the inner product in and I is the identity operator on 
ll-llp ^ 1 is the Ip norm, x and a are respectively re- 
ordered vectors of image samples and transform coefficients. 
We denote by riC the relative interior of a convex set C. A 
real- valued function / is coercive, if lim||a,||_^+oo / {x) — +(X), 
and is proper if its domain is non-empty dom / = {x G | 
f{x) < +00} 7^ 0. ro('H) is the class of all proper lower semi- 
continuous (Isc) convex functions from Ti to (— cxo, +00]. We 
denote by |||M||| = max^^o ^^e spectral norm of the 

linear operator M, and ker(M) := {x G H : Mx = 0, x 7^ 0} 
its kernel. 



Let X G be an ^/n x ^yn image, x can be written as 



the superposition of elementary atoms Lp^ parameterized by 
7 G X such that x — ^^^x^i^i ~ l-^l — L ^ n. 

We denote hy ^ : Ti' ^ Ti the dictionary (typicaUy a frame 
of , whose columns are the atoms all normalized to a unit 
^2-norm 

2. PROBLEM STATEMENT 

Consider the image formation model where an input image 
of n pixels x is indirectly observed through the action of a 
bounded linear operator H : H ^ AC, and contaminated by 
a noise s through a composition operator (e.g. addition), 



y ^ Hx £ . 



fl) 



The linear inverse problem at hand is to reconstruct x from 
the observed image y. 

A natural way to attack this problem would be to adopt a 
maximum a posteriori (MAP) bayesian framework with an 
appropriate likelihood function — the distribution of the ob- 
served data y given an original x — reflecting the statistics 
of the noise. As a prior, the image is supposed to be eco- 
nomically (sparsely) represented in a pre-chosen dictionary 
^ as measured by a sparsity-promoting penalty ^ supposed 
throughout to be convex but non-smooth, e.g. the ii norm. 



data fidelity term: 

n 

/poisson : ?7 G M"" ^ ^ /p(?7H), 



(6) 



if2/W = 0, /p(ryH) 



(-y\i]\og{r]\i])+r]\i] if ry^ > 0, 
]^+oo otherwise, 

h[i] if ^ [0,+oo), 
l^+oo otherwise. 



Using classical results from convex theory, we can show that. 



Proposition 2. /poisson is a proper, convex and Is c func- 
tion, /poisson is strictly convex if and only if\/i G {1, . . . , n}, y[i 
0. 



2.3 Multiplicative noise 

We consider the case without linear operator and as in [1] 
with a M-look full developed speckle noise. 



y = X£, £-r(M, 1/M) 



(7) 



2.1 Gaussian noise case 

For Gaussian noise, we consider the following formation model, 
2/ = Hx + £ , (2) 

where s ^ A/'(0, cr^). 

From the probability density function, the negative log-likelihood 
writes: 

/Gaussian V ^ ^ \\V - VWl /(^^^) ' (3) 

From this function, we can directly derive the following re- 
sult. 



Proposition 1. /caussian is a proper, strictly convex and 
Isc function. 

2.2 Poisson noise case 

The observed image is then a discrete collection of counts 
y — {y[i])^^i^ri which are bounded, i.e. y ^ loo- Each 
count y[i\ is a realization of an independent Poisson random 
variable with a mean (Hx)^. Formally, this writes in a vector 
form as 

y ~ 7^(Hx) . (4) 



From the probability density function of a Poisson random 
variable, the likelihood writes: 

Taking the negative log-likelihood, we arrive at the following 



In order to simplify the problem, the logarithm of the obser- 
vation is considered, \og{y) — log(x) + log(£) — z -\- uj. And 
in 1 , the authors proof that the anti log-likelihood yields, 

n 

/Multi : ryGR"H^M^(zH+exp(log(2/H)-zW) . (8) 

i=l 

Using classical results from convex theory, we can directly 
derive. 

Proposition 3. /Muiti is a proper, strictly convex and Isc 
function. 

2.4 Optimization problem 

Our aim is then to solve the following optimization problems, 
under a synthesis-type sparsity prioiQ, 

argmin J(a), 

J : a 1-^ /i o H o ^(a) + 7^(a) + o ^(a) . 

The data fidelity term /i reflect the noise statistics, the 
penalty function \ a ^ X^^Lo V^^'^W) positive, additive, 
and chosen to enforce sparsity, 7 > is a regularization 
parameter and ic is the indicator function of the convex set 
C (e.g. the positive orthant for Poissonian data). 

For the rest of the paper, we assume that /i is a proper, con- 
vex and Isc function, i.e. /i G ro('H). This is true for many 
kind of noises including Poisson, Gaussian, Laplacian. . . (see 
[3] for others examples). 



^Our framework and algorithms extend to an analysis- type 
prior just as well. 



From the objective in ( |P/i,7,^[ ), we get the following, 
Proposition 4. 

(i) fi is a convex functions and so are /loH and /loHo^. 

(a) Suppose that fi is strictly convex, then /i o H o ^ re- 
mains strictly convex if^ is an orthobasis anc? ker(H) = 
0. 

(Hi) Suppose that (0, +00) n H([0,+oo)) / 0. Then J G 

2.5 Well-posedness of ( |P/,,^,^[ ) 

Let M be the set of minimizers of problem ( |P/i,7,-0[ )- Sup- 
pose that ^ is coercive. Thus J is coercive. Therefore, the 
following holds: 



As with multiplicative noise /Muiti involves the exponential, 
we need the W-Lambert function [8 in order to derive a 
closed form of the proximity operator. 

Lemma 10. Let y be the observations, the proximity op- 
erator associated to /iviuiti is, 

prox^^^ X = X-/3M-W (-/3Mexp(x - log(^) - pM)) , 

(11) 

where W is the W-Lambert function. 

We now turn to prox^^ which is given by Lemma [3 and the 
following result: 



Proposition 

Existence: 



|P/i,7,-0|) least one solution, i.e. M ^ 



Uniqueness: ( |P/i,7,-0[ ) has a unique solution if ^ is 
strictly convex, or under (ii) of Proposition^ 



Theorem 11 ([12 ). Suppose that \/ i: (i) ij^i is convex 
even- symmetric, non-negative and non- decreasing on R^^ 
and ipi(0) = 0; (ii) ijji is twice differentiable onR\{0}; (Hi) 
ipi is continuous on M, and admits a positive right derivative 
at zero '0^^(O) = lim^^o+ ^'/^^^ > 0. Then, the proximity 
operator prox5^.(/3) = Q^(/3) has exactly one continuous so- 



ITERATIVE MINIMIZATION ALGORITHM^^^^^ decoupled m each coordinate 



3. 

3.1 Proximal calculus 

We are now ready to describe the proximal splitting algo- 
rithms to solve ( |P/i,7,-0[ ) • At the heart of the splitting frame- 
work is the notion of proximity operator. 

Definition 6 {^^) . Let F ^ T q{H) . Then, for every 
X ^ H, the function y F{y) + \\x — yW^ /2 achieves its 
infimum at a unique point denoted by prox^ x. The operator 
prox^ : Ti ^ Ti thus defined is the proximity operator of F. 

Then, the proximity operator of the indicator function of a 
convex set is merely its orthogonal projector. One important 
property of this operator is the separability property: 



Lemma 7 (H). Let Fk G FoCH), k G {1, 



,K} 



letG : {xk)i^k^K ^ JZk^kixk)- Then^ioxQ = (prox^^)i^fc^K. 

For Gaussian noise, we can easily prove that with /i as de- 
fined in (|3|, 



Lemma 8. Let y be the observation, the proximity op 
erator associated to /oaussian (i-e. the Gaussian anti log 
likelihood) is, 

/3y + cr^x 

prox^ r . X — 

^ A-' J Graussian 



(9) 



The following result can be proved easily by solving the prox- 
imal optimization problem in Definition [6] with /i as defined 
in (|6]), see also [5 . 



Lemma 9. Let y be the count map (i.e. the 

tions), the proximity operator associated to /poisson 
Poisson anti log-likelihood) is. 



prox^^p 



x[i] - /3 + y/{x[i]-p)^ + 4fy[i] 



observa- 
(i.e. the 



(10) 



if l/?WK<5V.+(0) 

if \m>sA+{o) 



(12) 



Among the most popular penalty functions ipi satisfying the 
above requirements, we have ipi{a[i]) = \oi[i] \ ,V i, in which 
case the associated proximity operator is soft-thresholding, 
denoted ST in the sequel. 

3.2 Splitting on the primal problem 

3.2.1 Splitting for sums of convex functions 
Suppose that the objective to be minimized can be expressed 
as the sum of K functions in Fo('H), verifying domain qual- 
ification conditions: 



argmm 



Fix) 



(13) 



Proximal splitting methods for solving (|13p are iterative al- 
gorithms which may evaluate the individual proximity oper- 
ators prox^^ , supposed to have an explicit convenient struc- 
ture, but never proximity operators of sums of the Fk- 

Splitting algorithms have an extensive literature since the 
1970's, where the case K = 2 predominates. Usually, split- 
ting algorithms handling K > 2 have either explicitly or 
implicitly relied on reduction of (pT|) to the case K = 2 in 
the product space . For instance, applying the Douglas- 
Rachford splitting to the reduced form produces Spingarn's 
method, which performs independent proximal steps on each 
Fk, and then computes the next iterate by essentially aver- 
aging the individual proximity operators. The scheme de- 
scribed in [6] is very similar in spirit to Spingarn's method, 
with some refinements. 



3.2.2 Application to noisy inverse problems 
Problem ( |P/i,7,v;P is amenable to the form ([T3|), by wisely 
introducing auxiliary variables. As ( |P/i,7,-0P involves two 
linear operators and H), we need two of them, that we 



define as xi = and X2 — Hxi. The idea is to get rid 
of the composition of ^ and H. Let the two linear opera- 
tors Li = [I - *]_and_L2 = [-H I 0]. Then, the 
optimization problem ( |P/i,7,V;[ ) can be equivalent ly written: 

argmin /i (^2) + ^c(a:l) + 7^(a) + (14) 

^kerLi(a:i,X2,a) + ^kerL2(3:^l,^2,a) ■ (15) 

Notice that in our case K = 3 hy virtue of separability of 
the proximity operator of C in xi, X2 and a; see Lemma [3 

Algorithm 1: Primal scheme for solving ( |P/i,7,V; >• 
Parameters: The observed image y, the dictionary ^, 
number of iterations A^iter, > and regularization 
parameter 7 > 0. 
Initialization: 

Vz G {1, 2, 3},^ p(o,^) = (0, 0, 0)^. zo = (0, 0, 0)^. 
Main iteration: 
For t = to A^iter - 1, 

• Data fidelity fLemmas [8l [9l and [T0|) : 

^(t,i)[l] =prox^/,/3(P(t,i)[l]). 

• Sparsity-penalty (Lemma [TT)): 
^(t,i)[2] = prox^^^/3(p(,,i)[2]). 

• Positivity constraint: ^(t,i)[3] = ^c(p(t,i) [3]). 

• Auxiliary constraints with Li and L2: (Lemma [12)): 

^(t,2) = "PkerLi (P(t,2)), 'f(t,3) = ^ker L2 (P(t ,3) ) • 

• Average the proximity operators: 

= (^(t,l) + C{t,2) + ^(t,3))/3. 

• Choose Ot e]0,2[. 

• Update the components: 

Vi G {1,2,3}, P(t+i,i) =P(t,i) +^t{2^t - zt -^(^,^))■ 

• Update the coefficients estimate: 

zt+i = zt + Ot[^t - Zt). 

End main iteration 

Output: Reconstructed image x^ = ^A^iterM- 



The proximity operators of F and ^ are easily accessible 
through Lemmas [51 [H [10] and [TT] The projector onto C 
is trivial for most of the case (e.g. positive orthant, closed 
interval). It remains now to compute the projector on ker L^, 
i — 1,2, which by well-known linear algebra arguments, is 
obtained from the projector onto the image of L*. 

Lemma 12. The proximity operator associated to ZkerL^ 

is 

^erL, =I-L*(L.oL*)-'L. . (16) 

The inverse in the expression of T^kerLi is (I + ^o^^)"^ can 
be computed efficiently when ^ is a tight frame. Similarly, 
for L2, the inverse writes (I+HoH*)""*^, and its computation 
can be done in the domain where H is diagonal; e.g. Fourier 
for convolution or pixel domain for mask. 

Finally, the main steps of our primal scheme are summarized 
in Algorithm[T] Its convergence is a corollary of [6] [Theorem 3.4] . 

Proposition 13. Let {zt)t^n be a sequence generated by 
Algorithm[J\ Suppose that Proposition^ (Hi) is verified, and 



X^tgi^^t(2 — Ot) = +00. Then {zt)teN converges to a (non- 
strict) global minimizer of ( |P/^,^,^| ). 

3.2.3 Splitting on the dual: Primal-dual algorithm 
Our problem (jP/T^T^ rewritten in the form, 

argmin F o K(a) + 7^(a) (17) 
/ H o ^ \ 

where now K = I ^ 1 and F : (xi,X2) ^ fi{xi) + 

ic{x2). Again, one may notice that the proximity operator 
of F can be directly computed using the separability in xi 
and X2. 

Recently, a primal-dual scheme, which turns to be a pre- 
conditioned version of ADMM, to minimize objectives of the 
form (|17|) was proposed in [2 . Transposed to our setting, 
this scheme gives the steps summarized in Algorithm [2] 

Adapting the arguments of [2 , convergence of the sequence 
{at)ten generated by Algorithm [2] is ensured. 

Proposition 14. Suppose that Proposition^ (Hi) holds. 
Let C = III ^ If (1 + |||H|f ), choose r > and a such that 
ar(^ < 1, and let {at)teR cls defined by Algorithmic Then, 
{a)teN converges to a (non-strict) global minimizer ( |P/i,7,V;| ) 
at the rate 0{l/t) on the restricted duality gap. 

3.3 Discussion 

Algorithm [1] and [2] share some similarities, but exhibit also 
important differences. For instance, the primal-dual algo- 
rithm enjoys a convergence rate that is not known for the 
primal algorithm. Furthermore, the latter necessitates two 
operator inversions that can only be done efficiently for some 
^ and H, while the former involves only application of these 
linear operators and their adjoints. Consequently, Algo- 
rithm [2] can virtually handle any inverse problem with a 
bounded linear H. In case where the inverses can be done 
efficiently, e.g. deconvolution with a tight frame, both algo- 
rithms have comparable computational burden. In general, 
if other regularizations/constraints are imposed on the solu- 
tion, in the form of additional proper Isc convex terms that 
would appear in ( |P/i,7,V;| ), both algorithms still apply by 
introducing wisely chosen auxiliary variables. 

4. EXPERIMENTAL RESULTS 

4.1 Deconvolution under Poisson noise 

Our algorithms were applied to deconvolution. In all ex- 
periments, ^ was the ^i-norm. Table [T] summarizes the 
mean absolute error (MAE) and the execution times for 
an astronomical image, where the dictionary consisted of 
the wavelet transform and the PSF was that of the Hubble 
telescope. Our algorithms were compared to state-of-the- 
art alternatives in the literature. In summary, ffexibility of 
our framework and the fact that Poisson noise was handled 
properly, demonstrate the capabilities of our approach, and 
allow our algorithms to compare very favorably with other 
competitors. The computational burden of our approaches 
is also among the lowest, typically faster than the PIDAL 
algorithm. Fig. [T] displays the objective as a function of the 
iteration number and times (in s). We can clearly see that 
Algorithm 2 converges faster than Algorithm 1. 



Algorithm 2: Primal-dual scheme for solving ( |P/i,7,v;| )- 
Parameters: The observed image the dictionary ^, 
number of iterations A^iter, proximal steps a > and r > 0, 
and regularization parameter 7 > 0. 
Initialization: 

— oiQ — — rjQ — 0. 
Main iteration: 

For t = to A^iter - 1, 

• Data fidelity fLemmas [8l [9l and [TQ|) : 

^t+i = (I-crprox^^/^)(^t/cr + Ho*at). 

• Positivity constraint: rft+i = (I — crVc){rit/cr + ^at)- 

• Sparsity-penalty (Lemma [11]): 

ttt+i = prox^^^ [at - r^^ (H*.Jt+i + ?7t+i)) • 

• Update the coefficients estimate: a^+i = 2at+i — at 

End main iteration 

Output: Reconstructed image x'^ = ^aN^^er- 
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RL-TV inj 


StabG Uoj 


PIDAL-FS ^13' 


MAE 


63.5 


52.8 


43 


43.6 


Times 


230s 


4.3s 


311s 


342s 




Alg.liJ 


Alg.l2| 




MAE 


46 


43.6 


Times 


183s 


154s 



Table 1: MAE and execution times for the deconvo- 
lution of the sky image. 

4.2 Inpainting with Gaussian noise 

We also applied our algorithms to inpainting with Gaussian 
noise. In all experiments, ^ was the ^i-norm. Fig [2] sum- 
marizes the results with the PSNR and the execution times 
for the Cameraman, where the dictionary consisted of the 
wavelet transform and the mask was create from a random 
process (here with about 34% of missing pixels). Notice 
that both algorithms leads to the same solution which gives 
a good reconstruction of the image. Fig. [3] displays the ob- 
jective as a function of the iteration number and times (in 
s). Again, we can clearly see that Algorithm 2 converges 
faster than Algorithm 1. 

4.3 Denoising with Multiplicative noise 

As we work on the logarithm the problem f see 12.31 the final 
estimate for each algorithm is given by taking the exponen- 
tial of the result. In all experiments, ^ was the -^i-norm. 
The Barbara image was set to a maximal intensity of 30 




Figure 1: Objective function for deconvolution un- 
der Poisson noise in function if iterations (left) and 
times (right). 




Alg.[T](PSNR = 25. 



Alg.|2](PSNR = 25.8) 



Figure 2: Inpainting results for the Cameraman us- 
ing our two algorithms. 
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Figure 3: Objective function for inpainting with 
Gaussian noise in function if iterations (left) and 
times (right). 



and the minimal to a non-zero value in order to avoid issues 
with the logarithm. The noise was added using M = 10 
which leads to a medium level of noise. Fig [H summarizes 
the results with the MAE and the execution times for Bar- 
bara, where the dictionary consisted of the curvelets trans- 
form. Our methods give correct reconstruction of the image. 
Fig. [5] displays the objective as a function of the iteration 
number and times (in s). Again, we can clearly see that 
Algorithm 2 converges faster than Algorithm 1. 

5. CONCLUSION 

In this paper, we proposed two provably convergent algo- 
rithms for solving the linear inverse problems with a sparsity 
prior. The primal-dual proximal splitting algorithm seems 
to perform better in terms of convergence speed than the pri- 
mal one. Moreover, its computational burden is lower than 
most comparable of state-of-art methods. Inverse problems 
with multiplicative noise does not enter currently in this 
framework, we will consider its adaptation to such problems 
in future work. 
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